Routing scanned documents with scanned control sheets

ABSTRACT

A scanning system routes scanned document information (110) to a specified location (120) based on scanned control sheet information (108). Each location (120) is associated with an existing identifier (118). Scanned control sheet information (108) is retrieved by the scanning system from graphical information displayed on a control sheet (102). The system compares a tentative identifier (124) obtained from the scanned control sheet information (108) to existing identifiers (118) to determine a location (120) to which scanned document information (110) should be routed.

FIELD OF INVENTION

[0001] This invention pertains to the field of routing scanned document information. More specifically, this invention pertains to the use of scanned control sheets to route scanned document information to an existing location.

BACKGROUND OF THE INVENTION

[0002] Scanners are commonly used in business enterprises and other organizations to convert paper documents into electronic form. Because scanners are expensive, complex pieces of equipment, it is common for many persons in an organization to share the use of a single scanner. Typically, the scanned document information generated when a user scans a document is stored in a default location. When the scanner is attached to a computer network, the scanned document information may ultimately be moved over the network to a desired location on the network, for instance a particular sub-directory of a user's file directory. In order to do this, however, the user needs to interact with a computer on the network after the document has been scanned.

[0003] Rather than putting scanned document information in a default storage area until claimed by someone on the network, a scanner may allow a user to enter a desired destination prior to scanning. Then the scanned document information can be routed directly to the desired destination, without further user intervention.

[0004] With either of these conventional systems, however, a user may not place a number of separate documents, each with a separate destination, into the scanner and expect the scanned document information to arrive at the correct locations without further intervention. When using a system of the first type, the user will need to later use a computer on the network to move the scanned document information to the appropriate locations. With the second type of system, the user will need to enter each separate destination into the scanner prior to the scanning of each document. Even though it may take a while for the scanner to work its way through each document, the user will typically need to wait for the scanner to finish scanning each document in order to enter the next destination.

[0005] What is needed is a scanning system which allows a user to communicate a desired destination for a scanned document in a way which allows the destination information to stay with the physical document. This would allow a set of documents, each with a unique destination, to be scanned and routed automatically, without further user intervention. This would also allow documents to be routed without the user having to interact with a computer.

SUMMARY OF THE INVENTION

[0006] The present invention is a system and method for directing the routing of scanned document information (110) with control sheets (102). A control sheet (102) is typically a piece of paper with graphical information on it. This information indicates to the system where the scanned document information (110) should be routed. It may be in the form of human-readable writing, it may be in the form of machine-readable markings, or it may be a combination of the two. The system retrieves this information by scanning the control sheet (102). In one embodiment, the invention routes the scanned document information (110) to the destination (120) which most nearly matches the scanned control sheet information (108). This allows for proper operation in the case of minor errors in the analysis of the control sheet information (108).

BRIEF DESCRIPTION OF THE DRAWINGS

[0007] FIG. 1 is an illustration of one embodiment of the present invention.

[0008] FIG. 2 is a flowchart illustrating the operation of one embodiment of the present invention.

[0009] FIG. 3 is a diagram illustrating the use of existing identifiers 118 and tentative identifiers 124 to determine document identifiers 126.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0010] Referring now to FIG. 1, the operation of one embodiment of the present invention is shown to include scanner 106. Document 104 to be scanned and routed to a particular one of several locations 120 is fed into scanner 106 with control sheet 102. Control sheet 102 of the described embodiment is a piece of paper. In alternate embodiments, control sheet 102 may be any object which can be scanned by scanner 106, including transparencies, cardboard, etc. Control sheet 102 displays information in a graphical form which may be detected by scanner 106. In the described embodiment, this information takes the form of human-readable text. In an alternate embodiment, this information may be in a form other than a human-readable form, such as conventional bar codes. Control sheet 102 may be fed into scanner 106 either immediately prior to or immediately following document 104 being fed into scanner 106. Scanner 106 can include modes, selectable by the user, which indicate whether control sheet 102 precedes or follows document 104. Alternately, information contained in control sheet 102 can indicate whether it is associated with the preceding or following document 104.
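
The pairing of control sheets with documents under a user-selected mode might be sketched as follows. This is only an illustrative sketch, not the described embodiment: the function name pair_sheets_with_documents, the (kind, payload) item representation, and the sheet_first flag are assumptions introduced here, and each scanned item is assumed to have already been classified as a control sheet or a document.

```python
from typing import List, Tuple

# Hypothetical representation: (kind, payload), where kind is
# "control" or "document" and payload stands in for the scanned data.
Item = Tuple[str, str]


def pair_sheets_with_documents(
    items: List[Item], sheet_first: bool = True
) -> List[Tuple[str, str]]:
    """Pair each control sheet with the document scanned next to it.

    sheet_first=True models the mode in which the control sheet
    immediately precedes its document; False models the mode in
    which it immediately follows.
    Returns (control_payload, document_payload) pairs.
    """
    if len(items) % 2 != 0:
        raise ValueError("expected an even number of scanned items")
    pairs = []
    for i in range(0, len(items), 2):
        first, second = items[i], items[i + 1]
        sheet, document = (first, second) if sheet_first else (second, first)
        if sheet[0] != "control" or document[0] != "document":
            raise ValueError(f"unexpected item order at position {i}")
        pairs.append((sheet[1], document[1]))
    return pairs
```

Raising an error on an unexpected order is one possible design choice; a real system could instead fall back to a default location.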

[0011] FIG. 2 is a flowchart of a method for practicing the invention. The modules which make up the steps of the flowchart can be implemented in hardware, firmware, software or any combination thereof. Referring now to FIGS. 1 and 2, when scanner 106 scans 202 control sheet 102, scanned control sheet information 108 is produced. Scanned control sheet information 108 is an electronic image of the graphical information displayed by control sheet 102, and is stored in scan storage memory 112. Scan storage memory 112 is a digital memory device which is capable of being written to and read from. Central processing unit (CPU) 114, which can reside either within scanner 106 or outside scanner 106, reads scanned control sheet information 108 out of scan storage memory 112 and interprets it using Optical Character Recognition (OCR) 206. OCR is a conventional process by which human-readable graphical characters are converted into machine-readable digital information. These human-readable characters may be either handwritten or machine produced. Alternatively, CPU 114 may use other methods of image analysis to determine machine-readable digital information from scanned control sheet information 108. Other forms of machine-readable information include bar-codes and dot-patterns. Such machine-readable information can include checksum information to increase the accuracy of the scanned control sheet information 108.
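
As one illustration of how checksum information could catch a misread identifier, the sketch below appends a single check digit to an identifier and verifies it after decoding. The scheme (sum of the UTF-8 bytes modulo 10) and the function names add_checksum and verify_checksum are assumptions chosen for brevity; the text does not specify any particular checksum.

```python
def add_checksum(identifier: str) -> str:
    """Append a one-digit checksum (byte sum modulo 10) to the identifier."""
    check = sum(identifier.encode("utf-8")) % 10
    return f"{identifier}{check}"


def verify_checksum(encoded: str) -> bool:
    """Return True if the trailing digit matches the checksum of the rest."""
    if len(encoded) < 2 or encoded[-1] not in "0123456789":
        return False
    body, check = encoded[:-1], int(encoded[-1])
    return sum(body.encode("utf-8")) % 10 == check
```

A single-digit checksum of this kind detects many single-character misreads but not all errors; practical bar code symbologies define their own, stronger check schemes.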

[0012] The information extracted from scanned control sheet information 108 in step 206 is tentative identifier 124. Tentative identifier 124 is used to determine the appropriate document identifier 126 for information from document 104. Document identifier 126 is an identifier which is associated with a scanned document and with the location 120 to which the scanned document information 110 is to be routed. CPU 114 accesses 208 a list of existing identifiers 118, each of which is associated with a particular location 120. There are an arbitrary number, m, of existing identifiers 118 and locations 120. In alternate embodiments of the present invention the number of existing identifiers 118 and the number of locations 120 may be different. For example, an existing identifier 118 could be associated with more than one location 120, and a location 120 could be associated with more than one existing identifier 118.
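
The association between existing identifiers 118 and locations 120, including the alternate embodiments in which the relationship is not one-to-one, could be held in a structure like the one sketched below. The class name IdentifierRegistry and its methods are hypothetical, and locations are represented as plain strings purely for illustration.

```python
from collections import defaultdict
from typing import Dict, List


class IdentifierRegistry:
    """Maps each existing identifier to one or more locations."""

    def __init__(self) -> None:
        self._locations: Dict[str, List[str]] = defaultdict(list)

    def associate(self, identifier: str, location: str) -> None:
        """Link an identifier to a location (duplicates are ignored)."""
        if location not in self._locations[identifier]:
            self._locations[identifier].append(location)

    def identifiers(self) -> List[str]:
        """Return the list of existing identifiers, in insertion order."""
        return list(self._locations)

    def locations_for(self, identifier: str) -> List[str]:
        """Return the locations linked to an identifier (empty if unknown)."""
        return list(self._locations.get(identifier, []))
```

Because the mapping goes from identifier to a list of locations, two identifiers can share one location and one identifier can fan out to several.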

[0013] After accessing 208 the list of existing identifiers 118, CPU 114 compares 210 tentative identifier 124 to the list of existing identifiers 118, to determine 212 whether any existing identifier 118 matches tentative identifier 124. If an existing identifier 118 does match tentative identifier 124, document identifier 126 is set 224 to that existing identifier 118. Otherwise, a fuzzy matching method is used 214 to determine whether any existing identifier 118 is similar enough to tentative identifier 124 to be considered a match. Fuzzy matching encompasses all non-literal matching methods.

[0014] An example of a fuzzy matching method which can be used is that described in U.S. Pat. No. 5,600,835 to Harry T. Garland et al., which is incorporated by reference herein in its entirety. This fuzzy matching method compares two character strings and generates a “dissimilarity value,” which is a measure of how different the character strings are. In step 214, a dissimilarity value is computed for each existing identifier 118, as compared to the tentative identifier 124. CPU 114 then compares 216 each generated dissimilarity value to a predetermined threshold, in order to determine whether any dissimilarity value is less than the threshold 218. If no dissimilarity value is less than the threshold, document identifier 126 is set 220 to an identifier associated with a location for documents with unrecognized tentative identifiers. If there is an existing identifier 118 for which the dissimilarity value is less than the threshold, the document identifier 126 is set 222 to equal the existing identifier 118 with the smallest dissimilarity value. As illustrated in FIG. 3, tentative identifiers 124 which exactly or nearly match one of the existing identifiers 118 cause the document identifier 126 to be set to that existing identifier 118. Those tentative identifiers 124 which do not nearly match an existing identifier 118 (such as “Walter” in FIG. 3) result in a document identifier 126 which is used for unrecognized tentative identifiers 124 (“Unrecognized,” in FIG. 3). All unrecognized document information 110 is routed to a location 120 for such “lost” document information 110. In the illustrative embodiment, the threshold value is either a default value, or is set by the user of the system through a user interface. There are other methods known and available to those skilled in the art for performing fuzzy matching. Thus, any method for fuzzy matching may be incorporated into the inventive system.
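
The decision flow of steps 210 through 224 might be sketched as follows. This sketch does not implement the method of U.S. Pat. No. 5,600,835; it substitutes Levenshtein edit distance as the dissimilarity value, which is an assumption made here only because any fuzzy matching method may be used. The names edit_distance, resolve_identifier, and UNRECOGNIZED, the sample identifiers, and the default threshold of 3 are likewise hypothetical.

```python
from typing import Iterable

# Hypothetical identifier for the "lost" document location.
UNRECOGNIZED = "Unrecognized"


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, used here as a stand-in dissimilarity value."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def resolve_identifier(tentative: str, existing: Iterable[str],
                       threshold: int = 3) -> str:
    """Return the document identifier for a tentative identifier."""
    candidates = list(existing)
    if tentative in candidates:          # steps 210-212: literal match
        return tentative                 # step 224
    if not candidates:
        return UNRECOGNIZED
    # Step 214: compute a dissimilarity value for each existing identifier.
    best = min(candidates, key=lambda e: edit_distance(tentative, e))
    # Steps 216-218: compare the smallest value to the threshold.
    if edit_distance(tentative, best) < threshold:
        return best                      # step 222
    return UNRECOGNIZED                  # step 220
```

With the hypothetical identifiers "Alice", "Robert", and "Susan", a misread such as "Rohert" resolves to "Robert", while "Walter" falls through to the unrecognized identifier.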

[0015] After the document identifier 126 has been set in any of steps 220, 222, or 224, the document 104 is scanned 226 into scanner 106, resulting in scanned document information 110. As described earlier, this step 226 could instead take place prior to the scanning 202 of control sheet 102. Scanned document information 110 is stored in scan storage memory 112. In an alternate embodiment, scanned control sheet information 108 and scanned document information 110 are stored in separate scan storage memories 112. CPU 114 then transfers 228 scanned document information 110 to the location 120 associated with document identifier 126. There are many known methods for routing document information 110 to an identified location 120.
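
As one of the many possible routing methods, transfer step 228 could be as simple as writing the scanned data into a file directory associated with the document identifier. The sketch below assumes that locations are file system directories and that scanned document information is available as bytes; the function name route_scan and its parameters are hypothetical.

```python
from pathlib import Path
from typing import Mapping, Union


def route_scan(scan_bytes: bytes, document_identifier: str,
               locations: Mapping[str, Union[str, Path]],
               filename: str) -> Path:
    """Write scanned document data into the directory for its identifier.

    locations maps each document identifier (including the one used
    for unrecognized documents) to a target directory.
    Returns the path of the file that was written.
    """
    target_dir = Path(locations[document_identifier])
    target_dir.mkdir(parents=True, exist_ok=True)  # create on first use
    target = target_dir / filename
    target.write_bytes(scan_bytes)
    return target
```

Other embodiments could route to a database entry or a network share instead; only the lookup from identifier to location is essential.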

[0016] The use of the present invention allows for scanned document information 110 to be centrally directed. For example, a worker may receive a work order for a particular job. This work order, while communicating to the worker what is to be done, might also include machine-readable identification, and be a control sheet 102. After completing the job, which includes producing or retrieving documents which need to be scanned, the worker puts the documents 104 and the control sheet 102 into a scanning system which operates in accordance with the present invention. Because the control sheet 102 is specific to the job, it can route scanned document information 110 to a location 120 which is also job-specific. Such a system would be useful to persons such as insurance adjusters, who need to retrieve, scan, and store case-specific documents 104 which might already exist in paper form.

[0017] Because scanning is non-destructive, this invention also allows users to keep a few control sheets 102 for repeated use. One control sheet 102 could be for personal documents, while others might be client-specific. Any time a document 104 in one of these categories is to be scanned, the appropriate control sheet 102 would be included, to ensure the scanned document information 110 is routed to the proper location 120.

[0018] The above description is included to illustrate the operation of an exemplary embodiment and is not meant to limit the scope of the invention. The scope of the invention is to be limited only by the following claims. From the above description, many variations will be apparent to one skilled in the art that would yet be encompassed by the spirit and scope of the present invention.

What is claimed is:
 1. A method for routing scanned document information to at least one of a plurality of locations based on graphical content of a control sheet, which locations are each associated with at least one of a plurality of existing identifiers, and which control sheet displays information in a graphical form, the method comprising the steps of: scanning the control sheet to retrieve scanned control sheet information therefrom; determining a tentative identifier from the control sheet information; comparing the tentative identifier to the existing identifiers; scanning a document which is not a control sheet, resulting in scanned document information; and responsive to the tentative identifier matching one of the existing identifiers, routing the scanned document information to a location associated with that existing identifier.
 2. The method of claim 1, wherein each location associated with an existing identifier is at least one of a computer file folder, a computer file directory, and an entry in a database linking the location to information necessary to retrieve the scanned document information.
 3. The method of claim 1, wherein the step of determining a tentative identifier from the control sheet information comprises the sub-steps of: performing Optical Character Recognition (OCR) on the control sheet information; and determining a tentative identifier from the results of the OCR.
 4. The method of claim 1, further comprising the steps of: responsive to the tentative identifier not matching any of the existing identifiers, determining whether there is an existing identifier which exhibits a desired level of similarity to the tentative identifier; and responsive to a determination that an existing identifier exhibits the desired level of similarity to the tentative identifier, routing the scanned document information to a location associated with an existing identifier which exhibits the desired level of similarity to the tentative identifier.
 5. The method of claim 4, wherein routing the scanned document information to a location associated with an existing identifier which exhibits the desired level of similarity to the tentative identifier comprises the sub-step of: routing the scanned document information to a location associated with the existing identifier which is most similar to the tentative identifier.
 6. The method of claim 4, further comprising the step of: responsive to a determination that no existing identifier exhibits the desired level of similarity to the tentative identifier, routing the scanned document information to a location for scanned document information with unrecognized tentative identifiers.
 7. The method of claim 4, wherein the step of determining whether there is an existing identifier which exhibits the desired level of similarity to the tentative identifier comprises the sub-steps of: using a matching method to determine for each of the existing identifiers a dissimilarity metric which is a measurement of the dissimilarity between the tentative identifier and that existing identifier; and comparing the dissimilarity metric which indicates the least dissimilarity to a dissimilarity threshold to determine whether the existing identifier associated with that dissimilarity metric exhibits the desired level of similarity to the tentative identifier.
 8. The method of claim 7, further comprising the step of: responsive to a determination that no existing identifier exhibits the desired level of similarity to the tentative identifier, routing the scanned document information to a location for scanned document information with unrecognized tentative identifiers.
 9. The method of claim 1, wherein the scanning of the control sheet and the scanning of the document both utilize the same scanning device.
 10. The method of claim 9, wherein the scanning of the control sheet occurs prior to the scanning of the document, and no other documents are scanned after the scanning of the control sheet and prior to the scanning of the document.
 11. The method of claim 9, wherein the scanning of the document occurs prior to the scanning of the control sheet, and no other documents are scanned after the scanning of the document and prior to the scanning of the control sheet.
 12. A document routing device for routing scanned document information to at least one of a plurality of locations based on graphical content of a control sheet, which locations are each associated with at least one of a plurality of existing identifiers, and which control sheet displays information in a graphical form, the device comprising: a scan storage memory coupled to a scanner for storing scanned control sheet information and scanned document information; a central processing unit (CPU) coupled to the scan storage memory; a program memory, coupled to the CPU, and storing a set of instructions, which, when executed by the CPU, cause the CPU to: access control sheet information from the scan storage memory; determine a tentative identifier from the control sheet information; compare the tentative identifier to the existing identifiers; access scanned document information from the scan storage; and responsive to the tentative identifier matching one of the existing identifiers, route the scanned document information to a location associated with that existing identifier.
 13. The device of claim 12, wherein each location associated with an existing identifier is at least one of a computer file folder, a computer file directory, and an entry in a database linking the location to information necessary to retrieve the scanned document information.
 14. The device of claim 12, wherein the step of determining a tentative identifier from the control sheet information comprises the sub-steps of: performing Optical Character Recognition (OCR) on the control sheet information; and determining a tentative identifier from the results of the OCR.
 15. The device of claim 12, wherein the array of instructions, when executed by the CPU, cause the CPU to: responsive to the tentative identifier not matching any of the existing identifiers, and responsive to an existing identifier exhibiting the desired level of similarity to the tentative identifier, route the scanned document information to a location associated with an existing identifier which exhibits the desired level of similarity to the tentative identifier.
 16. The device of claim 15, wherein routing the scanned document information to a location associated with an existing identifier which exhibits the desired level of similarity to the tentative identifier comprises the step of: routing the scanned document information to a location associated with the existing identifier which is most similar to the tentative identifier.
 17. The device of claim 15, wherein the array of instructions, when executed by the CPU, cause the CPU to: responsive to a determination that no existing identifier exhibits the desired level of similarity to the tentative identifier, route the scanned document information to a location for scanned document information with unrecognized tentative identifiers.
 18. The device of claim 15, wherein determining whether there is an existing identifier which exhibits the desired level of similarity to the tentative identifier comprises the steps of: using a matching method to determine for each of the existing identifiers a dissimilarity metric which is a measurement of the dissimilarity between the tentative identifier and that existing identifier; and comparing the dissimilarity metric which indicates the least dissimilarity to a dissimilarity threshold to determine whether the existing identifier associated with that dissimilarity metric exhibits the desired level of similarity to the tentative identifier.
 19. The device of claim 18, wherein the array of instructions, when executed by the CPU, cause the CPU to: responsive to a determination that no existing identifier exhibits the desired level of similarity to the tentative identifier, route the scanned document information to a location for scanned document information with unrecognized tentative identifiers.